Introduction

A comparison of JULES-ES-1p0 wave01 members against the original ensemble (wave00).

Wave 01 input parameter sets were picked using History matching to fall within Andy Wiltshires basic constraints on NBP, NPP, cSoil and cVeg stocks at the end of the 20th century. We use 300 of the 500 members, keeping back 2/5ths for emulator validation later.

We answer some basic questions.

What proportion of the new ensemble match AW’s constraints?

How good is a GP emulator? Does it get better overall with the new ensemble members added? In particular, does it get better for those members within the AW constraints?

Does the sensitivity analysis change?

Preliminaries

Load libraries, functions and data.

How many run failures were there?

There are no NAs but some relative humidity values are infinite. There are no “low NPP” ensemble members

## [1] 117464.6
## [1] FALSE
##      row col
## [1,] 140   9
## [2,] 232   9
## [3,] 249   9
## [4,] 300   9
## [1] Inf Inf Inf Inf
## [1] "rh_lnd_sum"

Ensemble behaviour in key (constraining) outputs.

Global mean for the 20 years at the end of the 20th Century. There is still a significant low bias on cVeg output.

### What proportion of models now fall within Andy’s constraints?

A third! Better than before, but still not great. Pointing at a significant model discrepency in cVeg

Of the 300 members of the wave01 ensemble, 100 pass Andy Wiltshire’s Level 2 constraints.

## [1] 100

Pairs plot of the inputs that pass the constraints with respect to the limits of the original ensemble.

Timeseries of mean carbon cycle properties over whole run.

Wave00 in blue and wave01 in red.

Plot Wave00 and Wave01 on top of one another.

Emulator fits

We hope that running the new ensemble gives us a better emulator, and allows us to rule out more input space. We particularly hope that the emulator is better for those members that are inside AW’s constraints.

First, we can look at the emulator errors in two cases: The level1a data (a basic carbon cycle), and then with the Wave01 data, which should have similar characteristics. (We should have eliminated really bad simulations, but wave01 is not constrained the data perfectly to be within AW constraints.)

Emulator fit list of level 1a ensemble

Remove an outlier from the new wave and build emulators

##      nbp_lnd_sum npp_nlim_lnd_sum    cSoil_lnd_sum     cVeg_lnd_sum 
##              583              213              317              312
##      nbp_lnd_sum npp_nlim_lnd_sum    cSoil_lnd_sum     cVeg_lnd_sum 
##              314              243              243              243

Found the outlier - looks like it’s 440

## integer(0)

Leave-one-out analyses of emulator prediction accuracy

The top row shows the leave-one-out prediction accuracy of the original wave00 ensemble, and the lower row the entire wave00 AND wave01 ensemble combined.

Emulator accuracy of members from wave 00 and wave 01 that pass level 2 (AW’s) constraints

We see that the error stats for some of the outputs from wave01 are worse, but there are many more ensemble members that lie within the constraints for wave 01.

“pmae” is “proportional mean absolue error”, which is the mean absolute error expressed as a percentage of the original (minimally constrained) ensemble range in that output.

Does the emulator improve is you look at only the 37 members that pass level 2 constraints in wave 00?

This gives us an idea of how good the emulator is where it really matters, and as the members are consistent, gives us a fairer idea of whether the emulators have improved with more members.

Good news is, the emulators are more accurate for wave01.

These leave-one-out prediction accuracy plots rank the ensemble members from largest underprediction to largest overprediction using the wave00 predictions. A perfect prediction would appear on the horizontal “zero” line.

Many of the wave01 predictions are closer to the horizontal line, and therefore more accurate predictions.

None of the predictions are outside the uncertainty bounds, which suggests they are overconservative (should be smaller).

Looking at the proportional mean absolute error (pmae), expressed in percent, we can see that it doesn’t improve much for the whole ensemble, but does improve significantly for the subset of ensemble members that fall within AW’s constraints from the first ensemble (marked "_sub").

pmae_wave00 <- lapply(loostats_km_Y_level1a, FUN = function(x) x$pmae )
pmae_wave01 <- lapply(loostats_km_Y_level1a_wave01, FUN = function(x) x$pmae )

pmae_wave00_sub <- lapply(loostats_km_Y_level1a_sub, FUN = function(x) x$pmae )
pmae_wave01_sub <- lapply(loostats_km_Y_level1a_wave01_sub, FUN = function(x) x$pmae )

pmae_table <- cbind(pmae_wave00, pmae_wave01, pmae_wave00_sub, pmae_wave01_sub)

print(pmae_table)
##      pmae_wave00 pmae_wave01 pmae_wave00_sub pmae_wave01_sub
## [1,] 5.094064    4.92765     7.167737        4.913284       
## [2,] 4.282047    4.007731    4.804376        4.085918       
## [3,] 3.597198    3.79032     4.555865        3.833893       
## [4,] 4.241928    4.516057    4.816467        3.226295

Comparing atmospheric growth in wave00, wave01 and observations

Andy asks - what constraint does that give us in cumulative NBP?

Eddy suggests measuring cumulative NBP against atmospheric growth rate

Calculate the atmospheric growth rate of 1984- 2013 using a simple linear fit

## [1] "correlation agr vs cnbp (all members)"
## [2] "-0.0407011946805412"
## [1] "correlation agr vs cnbp (wave01)" "0.00334447281747124"
## 
## Call:
## lm(formula = agr_modern_ens ~ cnbp_modern_ens)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.047413 -0.003802  0.003838  0.006715  0.023090 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      1.114e-01  5.176e-04 215.173   <2e-16 ***
## cnbp_modern_ens -1.379e-09  1.519e-09  -0.908    0.364    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01155 on 497 degrees of freedom
## Multiple R-squared:  0.001657,   Adjusted R-squared:  -0.0003522 
## F-statistic: 0.8247 on 1 and 497 DF,  p-value: 0.3643
## 
## Call:
## lm(formula = agr_modern_wave01 ~ cnbp_modern_wave01)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
##   -4639   -4639   -4639   -4639 1382335 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)
## (Intercept)        4.639e+03  4.639e+03   1.000    0.318
## cnbp_modern_wave01 5.475e-29  9.483e-28   0.058    0.954
## 
## Residual standard error: 80210 on 298 degrees of freedom
## Multiple R-squared:  1.119e-05,  Adjusted R-squared:  -0.003344 
## F-statistic: 0.003333 on 1 and 298 DF,  p-value: 0.954

Interannual variability and cumulative NBP

(correlations are close to zero, especially in the later wave)

## [1] 0.03223847
## [1] 0.003344484

How close can we get the model to reality?

Using Atmospheric Growth Rate as an example, how close can we get the model to observations? Can we do better than standard? What are the trade offs of doing so? How does getting close in AGR affect performance in other outputs?

We’ve established that most of the original ensemble have an ME/MAE/RMSE larger than the standard run. More (but few) of the wave01 perform better than standard.

A map of the 2D projections of parameter space where the ensemble member performs better than standard.

The blue part is the first wave, and not subject to constraint so may be removed in the second wave (wave01).

Build emulators and find parts of parameter space that to better than standard at atmospheric growth.

Having trouble fitting RMSE, to trying mean error.

Why is there an odd collection at just under 1?

## [1] 100

This next pairs plot looks at all the ensemble members that have a better mean atmospheric growth error than standard.

This next plot looks at all the ensemble members that have a better mean atmospheric growth error than standard AND pass the level 2 constraints.

The number is small (41/300), but the ensemble members seem spread across parameter space.

## 
## optimisation start
## ------------------
## * estimation method   : MLE 
## * optimisation method : BFGS 
## * analytical gradient : used
## * trend model : ~alpha_io + a_wl_io + bio_hum_cn + b_wl_io + dcatch_dlai_io + 
##     dqcrit_io + dz0v_dh_io + f0_io + fd_io + g_area_io + g_root_io + 
##     g_wood_io + gs_nvg_io + hw_sw_io + kaps_roth + knl_io + lai_max_io + 
##     lai_min_io + lma_io + n_inorg_turnover + nmass_io + nr_io + 
##     retran_l_io + retran_r_io + r_grow_io + rootd_ft_io + sigl_io + 
##     sorp + tleaf_of_io + tlow_io + tupp_io + l_vg_soil
## * covariance model : 
##   - type :  matern5_2 
##   - nugget : NO
##   - parameters lower bounds :  1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 1e-10 
##   - parameters upper bounds :  1.978915 1.984434 1.985775 1.678886 1.961948 1.960223 1.976754 1.774573 1.971703 1.983945 1.989915 1.994414 1.970738 1.986029 1.924591 1.992288 1.984011 1.960324 1.994092 1.983781 1.953327 1.972882 1.972018 1.982383 1.986984 1.982191 1.988709 1.9928 1.987255 1.995015 1.96482 1.952566 
##   - best initial criterion value(s) :  -93.24008 
## 
## N = 32, M = 5 machine precision = 2.22045e-16
## At X0, 0 variables are exactly at the bounds
## At iterate     0  f=        93.24  |proj g|=       1.7935
## At iterate     1  f =       90.829  |proj g|=        1.8809
## At iterate     2  f =       90.029  |proj g|=        1.2973
## At iterate     3  f =       88.834  |proj g|=        1.2169
## At iterate     4  f =        87.74  |proj g|=        1.9017
## At iterate     5  f =       87.328  |proj g|=        1.8886
## At iterate     6  f =       87.032  |proj g|=        1.7518
## At iterate     7  f =       86.892  |proj g|=        1.6985
## At iterate     8  f =       86.811  |proj g|=        1.5798
## At iterate     9  f =       86.612  |proj g|=        1.6548
## At iterate    10  f =       86.249  |proj g|=        1.8591
## At iterate    11  f =       86.098  |proj g|=       0.97284
## At iterate    12  f =       85.721  |proj g|=         1.036
## At iterate    13  f =       85.649  |proj g|=       0.99851
## At iterate    14  f =        84.36  |proj g|=        1.9412
## At iterate    15  f =       83.877  |proj g|=          1.94
## At iterate    16  f =       83.822  |proj g|=       0.90083
## At iterate    17  f =       83.756  |proj g|=       0.93582
## At iterate    18  f =       83.723  |proj g|=        1.9408
## At iterate    19  f =       83.704  |proj g|=        1.4108
## At iterate    20  f =        83.67  |proj g|=         1.658
## At iterate    21  f =       83.533  |proj g|=        1.6617
## At iterate    22  f =       83.059  |proj g|=        1.6542
## At iterate    23  f =       82.802  |proj g|=        1.0564
## At iterate    24  f =       82.579  |proj g|=         1.931
## At iterate    25  f =       82.534  |proj g|=        0.6187
## At iterate    26  f =       82.399  |proj g|=       0.67348
## At iterate    27  f =       82.239  |proj g|=       0.64738
## At iterate    28  f =       82.133  |proj g|=        1.5759
## At iterate    29  f =       82.109  |proj g|=         1.916
## At iterate    30  f =       82.086  |proj g|=        1.9171
## At iterate    31  f =       81.952  |proj g|=        1.5468
## At iterate    32  f =       81.601  |proj g|=       0.87797
## At iterate    33  f =       81.573  |proj g|=       0.84722
## At iterate    34  f =        81.56  |proj g|=       0.87025
## At iterate    35  f =       81.535  |proj g|=       0.92246
## At iterate    36  f =       81.485  |proj g|=        0.9065
## At iterate    37  f =       81.347  |proj g|=         1.939
## At iterate    38  f =       81.321  |proj g|=        1.6877
## At iterate    39  f =       81.315  |proj g|=        1.4009
## At iterate    40  f =       81.288  |proj g|=        0.8055
## At iterate    41  f =       81.243  |proj g|=       0.78044
## At iterate    42  f =       81.162  |proj g|=       0.80385
## At iterate    43  f =       81.128  |proj g|=        1.9409
## At iterate    44  f =        80.97  |proj g|=        1.9428
## At iterate    45  f =       80.919  |proj g|=        1.9413
## At iterate    46  f =       80.737  |proj g|=        1.6776
## At iterate    47  f =        80.66  |proj g|=         1.932
## At iterate    48  f =        80.57  |proj g|=        1.6616
## At iterate    49  f =       80.444  |proj g|=        1.9315
## At iterate    50  f =       80.435  |proj g|=       0.75878
## At iterate    51  f =       80.396  |proj g|=       0.74757
## At iterate    52  f =       80.385  |proj g|=        1.9292
## At iterate    53  f =       80.378  |proj g|=        1.9289
## At iterate    54  f =       80.348  |proj g|=       0.83962
## At iterate    55  f =       80.321  |proj g|=       0.92286
## At iterate    56  f =       80.272  |proj g|=        1.6572
## At iterate    57  f =       80.167  |proj g|=        1.0701
## At iterate    58  f =       80.145  |proj g|=       0.86087
## At iterate    59  f =       80.142  |proj g|=        1.0395
## At iterate    60  f =       80.134  |proj g|=       0.78875
## At iterate    61  f =       80.105  |proj g|=        0.6841
## At iterate    62  f =         80.1  |proj g|=       0.40399
## At iterate    63  f =       80.084  |proj g|=       0.37037
## At iterate    64  f =       80.065  |proj g|=       0.36352
## At iterate    65  f =        80.06  |proj g|=       0.36819
## At iterate    66  f =       80.058  |proj g|=       0.36825
## At iterate    67  f =       80.032  |proj g|=       0.38653
## At iterate    68  f =       79.994  |proj g|=       0.39826
## At iterate    69  f =       79.956  |proj g|=       0.39516
## At iterate    70  f =       79.886  |proj g|=       0.57475
## At iterate    71  f =       79.885  |proj g|=       0.54778
## At iterate    72  f =       79.882  |proj g|=       0.53274
## At iterate    73  f =       79.877  |proj g|=       0.42122
## At iterate    74  f =       79.873  |proj g|=        0.3397
## At iterate    75  f =       79.869  |proj g|=       0.34778
## At iterate    76  f =       79.865  |proj g|=       0.23027
## At iterate    77  f =       79.865  |proj g|=        1.1721
## At iterate    78  f =       79.864  |proj g|=       0.53876
## At iterate    79  f =       79.863  |proj g|=       0.33656
## At iterate    80  f =       79.861  |proj g|=       0.23533
## At iterate    81  f =        79.86  |proj g|=       0.17846
## At iterate    82  f =       79.857  |proj g|=      0.079675
## At iterate    83  f =       79.857  |proj g|=       0.22943
## At iterate    84  f =       79.857  |proj g|=       0.20128
## At iterate    85  f =       79.856  |proj g|=       0.17702
## At iterate    86  f =       79.856  |proj g|=       0.18723
## At iterate    87  f =       79.853  |proj g|=       0.15436
## At iterate    88  f =       79.849  |proj g|=       0.18165
## At iterate    89  f =       79.849  |proj g|=        0.1849
## At iterate    90  f =       79.848  |proj g|=       0.21643
## At iterate    91  f =       79.847  |proj g|=       0.22951
## At iterate    92  f =       79.846  |proj g|=       0.15604
## At iterate    93  f =       79.846  |proj g|=       0.38911
## At iterate    94  f =       79.845  |proj g|=       0.10436
## At iterate    95  f =       79.845  |proj g|=      0.080096
## At iterate    96  f =       79.845  |proj g|=      0.047201
## At iterate    97  f =       79.845  |proj g|=      0.047813
## At iterate    98  f =       79.845  |proj g|=       0.04842
## At iterate    99  f =       79.845  |proj g|=      0.048713
## At iterate   100  f =       79.845  |proj g|=       0.17138
## At iterate   101  f =       79.845  |proj g|=      0.047637
## At iterate   102  f =       79.845  |proj g|=      0.047286
## At iterate   103  f =       79.845  |proj g|=      0.046223
## At iterate   104  f =       79.845  |proj g|=      0.064501
## At iterate   105  f =       79.844  |proj g|=       0.14987
## At iterate   106  f =       79.844  |proj g|=       0.27457
## At iterate   107  f =       79.842  |proj g|=       0.39495
## At iterate   108  f =       79.841  |proj g|=       0.26122
## At iterate   109  f =        79.84  |proj g|=       0.59052
## At iterate   110  f =       79.839  |proj g|=       0.22896
## At iterate   111  f =       79.838  |proj g|=        0.2501
## At iterate   112  f =       79.838  |proj g|=        0.2219
## At iterate   113  f =       79.837  |proj g|=       0.13271
## At iterate   114  f =       79.836  |proj g|=       0.16465
## At iterate   115  f =       79.835  |proj g|=      0.092336
## At iterate   116  f =       79.835  |proj g|=      0.038708
## At iterate   117  f =       79.835  |proj g|=      0.023794
## At iterate   118  f =       79.835  |proj g|=      0.048634
## At iterate   119  f =       79.835  |proj g|=      0.043556
## At iterate   120  f =       79.835  |proj g|=      0.033871
## At iterate   121  f =       79.835  |proj g|=      0.024117
## At iterate   122  f =       79.835  |proj g|=      0.022181
## At iterate   123  f =       79.835  |proj g|=      0.020092
## At iterate   124  f =       79.835  |proj g|=      0.023425
## At iterate   125  f =       79.835  |proj g|=       0.04303
## At iterate   126  f =       79.835  |proj g|=      0.042936
## At iterate   127  f =       79.835  |proj g|=      0.012197
## At iterate   128  f =       79.835  |proj g|=       0.01442
## At iterate   129  f =       79.835  |proj g|=      0.013389
## At iterate   130  f =       79.835  |proj g|=      0.033173
## At iterate   131  f =       79.835  |proj g|=      0.029284
## At iterate   132  f =       79.835  |proj g|=     0.0055237
## At iterate   133  f =       79.835  |proj g|=     0.0041017
## At iterate   134  f =       79.835  |proj g|=      0.010269
## At iterate   135  f =       79.835  |proj g|=     0.0089332
## At iterate   136  f =       79.835  |proj g|=     0.0069077
## At iterate   137  f =       79.835  |proj g|=     0.0089309
## 
## iterations 137
## function evaluations 164
## segments explored during Cauchy searches 145
## BFGS updates skipped 0
## active bounds at final generalized Cauchy point 25
## norm of the final projected gradient 0.00893091
## final function value 79.835
## 
## F = 79.835
## final  value 79.834970 
## converged

This pairs plot shows the 2d and marginal density of emulated input points where the emulated atmospheric growth is closer to the observations than the standard model.

This technique might provide a useful set of points for optimising the model (at least to atmospheric growth).

Next, check emulators of all the other outputs and apply the constraints to them. See how the constraints change.

This is all constraints